Filling Statistics with Linguistics -- Property Design for the Disambiguation of German LFG Parses

Author

  • Martin Forst
Abstract

We present a log-linear model for the disambiguation of the analyses produced by a German broad-coverage LFG, focussing on the properties (or features) this model is based on. We compare this model to an initial model based only on a part of the properties provided to the final model and observe that the performance of a log-linear model for parse selection depends heavily on the types of properties that it is based on. In our case, the error reduction achieved with the log-linear model based on the extended set of properties is 51.0% and thus compares very favorably to the error reduction of 34.5% achieved with the initial model.
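As a rough illustration of the kind of model the abstract describes, the following minimal sketch scores each candidate analysis of a sentence by a weighted sum of its property values, normalizes the scores over the candidate set, and picks the most probable analysis. The function names, the property vectors, and the weights are hypothetical and chosen only for illustration; they are not the properties or the trained weights of the model described above.

```python
import math

def parse_probabilities(property_vectors, weights):
    """Conditional probabilities of the candidate analyses of one sentence.

    property_vectors: one list of property values per candidate analysis.
    weights: one weight per property (hypothetical values, not trained ones).
    """
    # Linear score of each analysis: dot product of weights and property values.
    scores = [sum(w * f for w, f in zip(weights, vec)) for vec in property_vectors]
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)  # normalization over all analyses of this sentence
    return [e / z for e in exps]

def select_parse(property_vectors, weights):
    """Index of the analysis with the highest probability under the model."""
    probs = parse_probabilities(property_vectors, weights)
    return max(range(len(probs)), key=probs.__getitem__)

# Hypothetical example: three candidate analyses, three property counts each.
candidates = [[1, 0, 2], [0, 1, 1], [1, 1, 0]]
weights = [0.5, -1.0, 0.25]

probs = parse_probabilities(candidates, weights)
best = select_parse(candidates, weights)
print(best, [round(p, 3) for p in probs])
```

In a sketch like this, adding properties only lengthens the vectors and the weight list; the selection procedure itself stays the same, which is why the choice of properties can be studied separately from the model form.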

Similar resources

Improving coverage and parsing quality of a large-scale LFG for German

We describe experiments in parsing the German TIGER Treebank. In parsing the complete treebank, 86.44% of the sentences receive full parses; 13.56% receive fragment parses. We discuss the methods used to enhance coverage and parsing quality and we present an evaluation on a gold standard, to our knowledge the first one for a deep grammar of German. Considering the selection performed by our cur...

TIGER TRANSFER Utilizing LFG Parses for Treebank Annotation

Creation of high-quality treebanks requires expert knowledge and is extremely time consuming. Hence applying an already existing grammar in treebanking is an interesting alternative. This approach has been pursued in the syntactic annotation of German newspaper text in the TIGER project. We utilized the large-scale German LFG grammar of the PARGRAM project for semi-automatic creation of TIGER t...

Parsing Bangla using LFG: An Introduction

This paper introduces the LFG (Lexical Functional Grammar) formalism for parsing Bangla. The LFG formalism, which has evolved from extensive computational, linguistic, and psycholinguistic research, provides a simple set of devices for describing the common properties of all natural languages and the particular properties of individual languages. This paper tabulates a set of instructions for us...

F-structure Transfer-based Statistical Machine Translation

In this paper, we describe a statistical deep syntactic transfer decoder that is trained fully automatically on parsed bilingual corpora. Deep syntactic transfer rules are induced automatically from the f-structures of an LFG-parsed bitext corpus by automatically aligning local f-structures and inducing all rules consistent with the node alignment. The transfer decoder outputs the n-best TL f-s...

Tagging and Morphological Disambiguation of Turkish Text

Automatic text tagging is an important component in higher-level analysis of text corpora, and its output can be used in many natural language processing applications. In languages like Turkish or Finnish, with agglutinative morphology, morphological disambiguation is a very crucial process in tagging, as the structures of many lexical forms are morphologically ambiguous. This paper describes ...

Publication date: 2007